Case Studies

Case 1: Phase III Two-Arm Group Sequential Trial

The RALES (Randomized Aldactone Evaluation Study) trial was a significant clinical study focusing on the effectiveness of an aldosterone receptor blocker compared to a placebo.

1. Design of the Trial

  • Type: The RALES trial was a double-blind, multicenter clinical trial. Being “double-blind” means that neither the participants nor the experimenters knew who was receiving the treatment or the placebo, which helps to prevent bias in the results.
  • Comparison: It involved a comparison between an aldosterone-receptor blocker and a placebo. Aldosterone receptor blockers are typically used to treat conditions like heart failure and hypertension by blocking the effects of the hormone aldosterone.

2. Endpoints and Objectives

  • Primary Endpoint: The primary endpoint of the trial was all-causes mortality, meaning the main outcome measured was the death rate from any cause among participants.
  • Objective: The trial aimed to determine if there was a significant reduction in mortality among patients treated with the aldosterone receptor blocker compared to those given a placebo.

3. Statistics and Accrual

  • Anticipated Accrual Rate: The trial planned to enroll 960 patients per year, aiming for a large sample size to ensure robust statistical power.
  • Hazard Rate: The hazard rate for the placebo group was assumed to be 38%. A hazard rate is an event rate per unit of follow-up time (here presumably per year, matching the annual accrual rate) rather than the cumulative proportion of patients who die during the study.
  • Power and Significance: The trial designers wanted 90% power to detect a 17% reduction in the hazard rate, with a one-sided alpha level of 0.025. This means the trial had a 90% probability of declaring a benefit if the treatment truly reduces the hazard by 17%, and at most a 2.5% probability of declaring a benefit when there is none (Type I error).
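
To give a feel for what these design inputs imply, the sketch below applies the Schoenfeld approximation for the number of deaths a fixed-sample log-rank test would need. It is an illustration under stated assumptions, not the calculation used by the trial designers: it assumes 1:1 randomization, ignores the inflation required by the six interim looks, and the function name `required_events` is hypothetical.

```python
from math import ceil, log
from statistics import NormalDist

def required_events(hazard_ratio, alpha_one_sided, power):
    """Schoenfeld approximation for a 1:1 log-rank test (fixed sample, no interim looks)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2

# A 17% hazard reduction corresponds to a hazard ratio of 0.83.
events = required_events(hazard_ratio=0.83, alpha_one_sided=0.025, power=0.90)
print(ceil(events))  # roughly 1,200 deaths before any group sequential inflation
```

The accrual rate and the placebo hazard rate then determine how long it takes to observe that many events, which is why they appear among the design inputs.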

4. Interim Monitoring and Group Sequential Design

  • Six-Look Group Sequential Design: The design allowed for accruing efficacy data to be monitored up to six times throughout the study duration. This approach helps identify treatment effects earlier and allows the trial to be stopped based on interim findings.
  • DSMB Meetings: Six Data and Safety Monitoring Board (DSMB) meetings were planned for interim reviews, which are critical in ensuring the safety of participants as the trial progresses.
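
Monitoring the data six times requires that the overall 2.5% type I error be divided across the looks. The source does not say which stopping boundary was used, so the sketch below assumes a Lan-DeMets spending function of the O'Brien-Fleming type and six equally spaced looks purely for illustration. It shows how little alpha is spent at early looks under that assumption.

```python
from statistics import NormalDist

def obf_type_spending(t, alpha_one_sided):
    """Cumulative one-sided alpha spent at information fraction t (Lan-DeMets, O'Brien-Fleming type)."""
    z = NormalDist().inv_cdf(1 - alpha_one_sided / 2)
    return 2 * (1 - NormalDist().cdf(z / t ** 0.5))

looks = [k / 6 for k in range(1, 7)]  # six equally spaced looks (an assumption)
cumulative = [obf_type_spending(t, 0.025) for t in looks]
for k, (t, a) in enumerate(zip(looks, cumulative), start=1):
    print(f"look {k}: information fraction {t:.2f}, cumulative alpha spent {a:.5f}")
```

Under this assumed spending function the full 0.025 is only reached at the final look, so an early stop requires very strong evidence.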

5. Outcome and Trial Stopping

  • Stopping Early: The trial was stopped early at the fifth interim analysis (look-5) by the DSMB. Stopping early for efficacy indicates that the treatment’s benefits were clear and substantial enough to conclude the trial ahead of schedule, potentially changing clinical practice sooner.
  • Time Saved: The early stop saved nearly two years of study duration, which was originally projected to end in June 2000 but concluded in August 1998 instead.
  • Impact: Early stopping of the trial for efficacy suggests that the aldosterone blocker significantly reduced mortality in the study population. This finding likely had a profound impact on subsequent clinical guidelines and patient care, especially for conditions like heart failure where aldosterone plays a crucial role.

Case 2: Sample Size Re-Estimation due to Uncertainty about Nuisance Parameters

Psoriasis Example

The Psoriasis trial example involves:

  • Primary Endpoint: Achievement of PASI-75 by week 16, which measures improvement in psoriasis.
  • Design Parameters: Designed for 95% power to detect a 10% improvement with a new treatment relative to placebo, with uncertainty about the placebo response rate (π_c = 7.5%).

The problem here is that the power of the trial depends on both the actual placebo response rate (π_c) and the effect size (δ), which can be unknown and vary. If π_c or δ are misestimated, it can impact the trial’s power, making the originally calculated sample size insufficient or excessive.

Strategy

By using an information-based design, the trial is allowed to adapt by recalculating the necessary sample size based on accruing data about the actual placebo rate and effect size. This is done through interim analyses, where the information accrued at look j, \(J_j\), is compared against the pre-specified maximum information \(I_{\text{max}}\). If an efficacy or futility boundary is crossed, the trial may be stopped early. If \(J_j\) meets or exceeds \(I_{\text{max}}\), the final analysis is performed. Otherwise enrollment continues, with the sample size adjusted so that \(I_{\text{max}}\), and therefore the desired power, is reached.

This approach proposes using “statistical information” rather than fixed sample sizes to guide the monitoring and conclusion of clinical trials. The rationale here is to accumulate enough information to make robust statistical decisions, thereby potentially making the trial more efficient and flexible.

This approach is particularly beneficial in scenarios like the psoriasis trial where there is considerable uncertainty about critical parameters that influence study outcomes. It allows the study to adapt to the observed data, making it potentially more efficient and likely to reach conclusive results.

Formula for Maximum Statistical Information

\[ I_{\text{max}} = \left( \frac{Z_{\alpha/2} + Z_{\beta}}{\delta} \right)^2 \times \text{Inflation Factor} \]

  • Z_α/2 and Z_β: These represent the critical values from the normal distribution for the type I error rate (α) and the power (1-β), respectively.
  • δ: This is the expected treatment effect size.
  • Inflation Factor: This factor accounts for adjustments in the design, like those due to interim looks in a group sequential design, which might inflate the required information due to the increased chance of type I error.

The table indicates that irrespective of the true placebo rate (π_c), the maximum statistical information \(I_{\text{max}}\) remains constant, while the sample size needed to reach it (N_max) changes with π_c, because the variance of a response rate depends on the rate itself.
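
The relationship between a fixed \(I_{\text{max}}\) and a varying N_max can be sketched numerically. The example below assumes a two-sided α of 0.05 and an inflation factor of 1 (neither is stated in the source), takes δ = 0.10 as an absolute difference in response rates, and uses hypothetical function names. It converts information into sample size by inverting the variance of a difference in proportions under 1:1 allocation.

```python
from math import ceil
from statistics import NormalDist

def max_information(alpha_two_sided, power, delta, inflation_factor=1.0):
    """I_max = ((Z_{alpha/2} + Z_beta) / delta)^2 * inflation factor."""
    z_alpha = NormalDist().inv_cdf(1 - alpha_two_sided / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ((z_alpha + z_beta) / delta) ** 2 * inflation_factor

def n_max(i_max, pi_c, delta):
    """Total sample size (both arms) that delivers i_max for a given placebo rate."""
    pi_e = pi_c + delta
    variance_sum = pi_c * (1 - pi_c) + pi_e * (1 - pi_e)
    return ceil(2 * i_max * variance_sum)

i_max = max_information(alpha_two_sided=0.05, power=0.95, delta=0.10)
for pi_c in (0.05, 0.075, 0.10, 0.15):
    print(f"pi_c = {pi_c:.3f}: I_max = {i_max:.1f}, N_max = {n_max(i_max, pi_c, 0.10)}")
```

The information target is the same in every row, while the required sample size grows as the placebo rate moves toward 50%.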

Calculation of Information at Each Look (J)

\[ J_j = \left[ \text{se}(\hat{\delta}_j)^{-1} \right]^2 = \left[ \frac{\hat{\pi}_c (1 - \hat{\pi}_c) + \hat{\pi}_e (1 - \hat{\pi}_e)}{N/2} \right]^{-1} \]

  • \(\text{se}(\hat{\delta}_j)^{-1}\): Represents the precision (inverse of the standard error) of the estimated treatment effect at look j; squaring it gives the information.
  • N/2: Assumes an equal split of the sample size N enrolled so far between treatment and control groups.
  • \(\hat{\pi}_e\) and \(\hat{\pi}_c\): Estimated rates of the endpoint for the experimental and control groups, respectively.
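
As a small worked illustration of the information calculation at an interim look, the sketch below computes the information from observed response rates and compares it with a target. The interim rates, the sample size of 400, and the target of 1299.5 are hypothetical values chosen only for the example.

```python
def observed_information(n_total, pi_c_hat, pi_e_hat):
    """Information about the rate difference with n_total patients split 1:1."""
    variance = (pi_c_hat * (1 - pi_c_hat) + pi_e_hat * (1 - pi_e_hat)) / (n_total / 2)
    return 1 / variance

i_max = 1299.5  # hypothetical target, e.g. from the maximum-information formula
j = observed_information(n_total=400, pi_c_hat=0.12, pi_e_hat=0.22)
print(f"J = {j:.1f}, information fraction = {j / i_max:.2f}")
```

If the information fraction is below 1 and no boundary has been crossed, enrollment would continue until the target is reached.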

Case 3: Sample Size Re-Estimation due to Uncertainty about Treatment Effect

Sample size re-estimation (SSR) in clinical trials is a strategic approach employed when initial assumptions about a study need adjustment based on interim data. This can be crucial for ensuring the scientific validity and efficiency of a trial.

Reasons for Sample Size Re-Estimation

  1. Insufficient Data on New Product:
    • Example: A novel treatment, like a new monoclonal antibody for Covid, hasn’t been extensively tested in the target population. Initial trials may start with conservative estimates that need adjustment as more specific data about the drug’s efficacy and safety profile are collected.
  2. Evolving Standard of Care:
    • Example: If the standard of care improves during the trial due to advancements in the control arm treatments, the relative benefit of the new therapy may appear reduced. This could necessitate a larger sample size to detect the true effect of the new treatment under the new standards.
  3. Phased Investment Strategy:
    • Example: Sponsors may choose to start with a smaller scale to manage risks and then decide to expand the trial after promising interim results. This strategy is common in phases of drug development where the commitment of resources is contingent on early indications of success.

Schizophrenia Example

  • Trial Details: A new drug is tested against a placebo for treating negative symptoms of schizophrenia, focusing on the Negative Symptoms Assessment (NSA) index.
  • Initial Powering: The trial is initially powered to detect a 2-point difference in the NSA index with a standard deviation (σ) of 7.5.
  • Reassessment: Interim data may prompt re-evaluation of these parameters to ensure the trial’s continued relevance and accuracy in its conclusions.

During interim analysis, questions might arise such as:

  • Should the study continue to target a difference (δ) of 2 points?
  • Is the assumed standard deviation (σ = 7.5) still valid?

To address these questions:

  • Conditional Power (CP): This is the probability that the study will detect the predefined effect size, given the interim results. Adjustments might be made to increase the sample size to enhance CP.
  • Adjusting Critical Cut-off: To maintain the integrity of the type-1 error rate, the critical cut-off value for stopping the trial might need adjustment.

Increasing Sample Size to Boost Conditional Power (CP)

  • Conditional Power (CP): This is defined as the probability, given the data observed so far (denoted by \(z_1\)), that the final test statistic \(Z_2\) will exceed a certain critical value \(c\) assuming the alternative hypothesis \(\delta\) is true. In simpler terms, CP measures the likelihood that a study will achieve its objectives (e.g., proving treatment efficacy) given the results observed at an interim analysis.
  • Formula: \(CP = P_\delta(Z_2 \geq c | z_1)\)
    • \(P_\delta\) indicates the probability under the alternative hypothesis.
    • \(Z_2\) is the final test statistic.
    • \(c\) is the critical value for concluding statistical significance.
    • \(z_1\) represents the interim results.

Increasing the sample size can boost CP because, when the treatment effect is real, additional data increase the expected value of the final test statistic, making it more likely that \(Z_2\) will exceed \(c\).
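
The effect of sample size on CP can be sketched for the schizophrenia example. The code below is a minimal illustration assuming a two-arm comparison of means with known σ, equal allocation, and a final statistic that pools both stages in the usual way. The interim value \(z_1 = 1.0\) and the per-arm sizes (100 at the interim, 221 planned) are hypothetical, and the cut-off \(c = 1.96\) is left unadjusted here.

```python
from math import sqrt
from statistics import NormalDist

def conditional_power(z1, n1, n_final, delta, sigma, c):
    """CP for a two-arm comparison of means; n1 and n_final are per-arm sample sizes."""
    n2 = n_final - n1  # per-arm patients still to be enrolled
    # Threshold the stage-2 increment must exceed for the final Z to reach c.
    threshold = (c * sqrt(n_final) - z1 * sqrt(n1)) / sqrt(n2)
    # Expected stage-2 standardized increment if the true difference is delta.
    drift = delta * sqrt(n2 / (2 * sigma ** 2))
    return 1 - NormalDist().cdf(threshold - drift)

cp_planned = conditional_power(z1=1.0, n1=100, n_final=221, delta=2, sigma=7.5, c=1.96)
cp_doubled = conditional_power(z1=1.0, n1=100, n_final=442, delta=2, sigma=7.5, c=1.96)
print(f"CP at planned size: {cp_planned:.2f}, CP at doubled size: {cp_doubled:.2f}")
```

With these hypothetical inputs, doubling the per-arm sample size raises CP from roughly 0.63 to roughly 0.96.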

Adjustment of Critical Cut-off

  • Type-1 Error and SSR: When you adjust the sample size based on interim results, known as Sample Size Re-estimation (SSR), there’s a potential risk of inflating the type-1 error rate (the probability of incorrectly rejecting the null hypothesis). The inflation arises because the final sample size now depends on the interim data, so the conventional test statistic no longer has its usual distribution under the null hypothesis.
  • Critical Cut-off Adjustment: To address this, the critical cut-off \(c\) used to determine the significance of results at the end of the trial needs adjustment. The new cut-off \(c^*\) must be determined such that the probability of a type-1 error given the adjusted sample size and interim results does not exceed the original planned type-1 error probability.
  • Formula: \(P_0(Z_2^* \geq c^* | z_1) \leq P_0(Z_2 \geq c | z_1)\)
    • \(P_0\) represents the probability under the null hypothesis.
    • \(Z_2^*\) is the new test statistic considering the adjusted sample size.
    • \(c^*\) is the new critical value.

This adjustment ensures that even with an altered trial design, the integrity of the study’s conclusions remains sound. The statistical methodology aims to improve the trial’s power (ability to detect a true effect) without compromising its rigor through inflation of the type-1 error.
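
One way to find a cut-off \(c^*\) that satisfies the inequality above with equality is to match the conditional type-1 error of the new design to that of the original design. The sketch below does this for a two-arm comparison of means with equal allocation. It is an assumption-laden illustration rather than the method used in any particular trial, and the interim value and per-arm sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def conditional_error(z1, n1, n_final, c):
    """P_0(Z_2 >= c | z1) under the null for per-arm sizes n1 and n_final."""
    b = (c * sqrt(n_final) - z1 * sqrt(n1)) / sqrt(n_final - n1)
    return 1 - NormalDist().cdf(b)

def adjusted_cutoff(z1, n1, n_planned, n_new, c):
    """Cut-off c* whose conditional error at n_new equals the planned one."""
    b = (c * sqrt(n_planned) - z1 * sqrt(n1)) / sqrt(n_planned - n1)
    return (z1 * sqrt(n1) + b * sqrt(n_new - n1)) / sqrt(n_new)

c_star = adjusted_cutoff(z1=1.0, n1=100, n_planned=221, n_new=442, c=1.96)
print(f"c* = {c_star:.3f}")
```

When the sample size is left unchanged the function returns the original cut-off, which is a quick sanity check on the algebra.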

Typical SSR Rules

  1. Cap on Increases: Often, increases in sample size are capped (e.g., no more than double the initial size) to prevent logistical and financial overextension.

  2. Zones of Adjustment:

    • Unfavorable Zone: If CP is below 30%, it might be decided not to increase the sample size.
    • Promising Zone: If CP is between 30% and 80%, the sample size might be adjusted to aim for approximately 90% CP, within the limits of the predefined cap.
    • Favorable Zone: If CP exceeds 80%, the current sample size is usually deemed sufficient, and no changes are made.
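
The zone rules above can be written as a small decision function. In this sketch the thresholds and the cap come from the rules just listed, while `n_for_target_cp` (the sample size that would give roughly 90% CP) is a hypothetical input assumed to be computed elsewhere.

```python
def ssr_decision(cp, n_planned, n_for_target_cp, cap_factor=2.0, low=0.30, high=0.80):
    """Return (zone, final sample size) under the zone rules described above."""
    if cp < low:
        return "unfavorable", n_planned
    if cp > high:
        return "favorable", n_planned
    n_cap = int(cap_factor * n_planned)
    # Never decrease below the planned size, never exceed the cap.
    return "promising", max(n_planned, min(n_for_target_cp, n_cap))

print(ssr_decision(cp=0.20, n_planned=442, n_for_target_cp=1500))
print(ssr_decision(cp=0.55, n_planned=442, n_for_target_cp=700))
print(ssr_decision(cp=0.55, n_planned=442, n_for_target_cp=1500))
print(ssr_decision(cp=0.90, n_planned=442, n_for_target_cp=300))
```

The third call shows the cap at work: the requested 1500 patients is truncated to twice the planned size.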

Case 4: Seamless Phase II/III Trial with Dose Selection

Crofelemer Study

  • Prevalence and Impact: Diarrhea affects 20-30% of HIV-infected individuals, posing a significant health burden. This condition complicates the management of HIV by impairing quality of life and complicating compliance with antiretroviral medications.
  • Consequences of Noncompliance: Noncompliance with antiretroviral regimens can lead to reduced drug levels, increased viral loads, and the development of drug resistance, further complicating treatment and disease progression.
  • Need for Effective Treatment: Effective management of diarrhea in HIV-infected patients is crucial as it could potentially improve overall HIV treatment outcomes by enhancing compliance and reducing complications related to elevated viral loads and drug resistance.
  • Earlier Study Findings: An initial 7-day study showed promising results, justifying the progression to a more extended and rigorous 28-day phase 3 trial to confirm efficacy and safety.
  • Trial Endpoint: The primary endpoint for the phase 3 trial is a binary measure: fewer than three watery bowel movements per week over a four-week period. This endpoint is chosen to quantitatively assess the improvement in diarrhea and thereby the potential improvement in patient quality of life and compliance with HIV treatment.
  • Dose Exploration: The optimal dose of crofelemer is unknown, leading to the decision to conduct pairwise comparisons among three dosage levels (125 mg, 250 mg, and 500 mg) versus a placebo. This approach allows the trial to evaluate the efficacy and safety profile of each dose relative to placebo.

Statistical Considerations

  • Placebo Response and Treatment Effect: The expected placebo response rate is 35%, with an anticipated 20% improvement with crofelemer treatment. These assumptions are critical for calculating the necessary sample size and for power calculations to ensure the study is adequately powered to detect a clinically meaningful effect.

  • Implications for Sample Size Re-Estimation: Given the uncertainty in the optimal dose and variability in the placebo response, an adaptive trial design with sample size re-estimation could be considered. This approach would allow adjustments based on interim analysis results, potentially optimizing the study design in real-time to ensure sufficient power and minimize unnecessary exposure to less effective doses.

  • Interim Analyses: Conducting interim analyses would allow for the assessment of preliminary efficacy and safety data. Based on these data, decisions could be made about continuing, modifying, or stopping the trial for futility or efficacy.

  • Adjustments Based on Conditional Power: If interim results suggest changes in the estimated placebo response or differentially greater efficacy at specific doses, the sample size could be adjusted to ensure that the study remains adequately powered to detect significant treatment effects.

Option 1: A Single 4-Arm Trial

  • Design: This design involves a single phase 3 trial with four arms, including three different doses of a treatment (125 mg, 250 mg, and 500 mg bid) and a placebo.
  • Multiplicity Correction: The Bonferroni-Holm method is used to adjust for the multiple comparisons made between each treatment dose and the placebo. This is important to maintain the integrity of the type-1 error rate across multiple hypothesis tests.
  • Sample Size: Requires 520 patients for 80% power at a 1-sided α = 0.025.
  • Pros and Cons: This design is straightforward and allows simultaneous comparison of all doses against placebo but requires a large sample size and stringent corrections for multiple testing which could reduce power.
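
The Bonferroni-Holm step-down procedure mentioned above can be sketched in a few lines. The p-values below are hypothetical and the function name is illustrative; the procedure itself sorts the p-values, compares the smallest with α/3, the next with α/2, and the largest with α, stopping at the first failure.

```python
def holm_rejections(p_values, alpha):
    """Bonferroni-Holm step-down test; returns one rejection flag per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] > alpha / (m - rank):
            break  # once one test fails, all larger p-values are retained
        rejected[i] = True
    return rejected

p = {"125 mg": 0.020, "250 mg": 0.004, "500 mg": 0.030}  # hypothetical p-values
flags = holm_rejections(list(p.values()), alpha=0.025)
print(dict(zip(p, flags)))
```

In this hypothetical case only the 250 mg comparison is declared significant, even though 125 mg has an unadjusted p-value below 0.025.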

Option 2: Operationally Seamless Phase 2-3 Design

  • Design: This is a two-stage design where the first trial (Phase 2) is used for dose selection from three different doses, and the second trial (Phase 3) confirms the efficacy of the selected dose against placebo.
  • Flexibility: Allows for flexibility in dose selection based on the results of the first trial.
  • Data Use: Data from Trial 1 cannot be used for the confirmatory test in Trial 2, which may lead to inefficiencies.
  • Sample Size: Requires 416 patients for 80% power.
  • Pros and Cons: Offers operational efficiency by seamlessly progressing from dose-finding to confirmation, but because the Phase 2 data are excluded from the confirmatory test it needs more patients than the inferentially seamless design (416 versus 380).

Option 3: Inferentially Seamless Phase 2-3 Design

  • Design: This design also features two stages but differs from Option 2 by allowing the use of data from both stages in the final analysis.
  • Multiplicity Control: Uses a combination of p-values and closed testing procedures to control the family-wise error rate (FWER) while accounting for the use of data across both stages.
  • Sample Size: Requires 380 patients for 80% power, offering a more efficient use of data and potentially lower total patient numbers.
  • Pros and Cons: Enhances efficiency and may reduce overall trial duration and patient exposure. However, it requires sophisticated statistical techniques to ensure proper control of error rates.

Multiplicity Control

  • Inverse Normal Combination: Combines p-values from different stages of the study to form a single test statistic, using weighted contributions based on the sample sizes of each stage.
  • Closed Testing: Involves testing a family of hypotheses (e.g., one for each dose) together with all of their intersections. To reject any elementary hypothesis, every intersection hypothesis that contains it must also be rejected, which strictly controls the FWER.

1. Inverse Normal Combination of Stage 1 and Stage 2 p-values

This method combines the p-values from different stages of the study using a weighted Z-transform approach. The formula provided:

\[ Z_2 = \sqrt{\frac{n_1}{n_1 + n_2}} \, \Phi^{-1}(1 - p_1) + \sqrt{\frac{n_2}{n_1 + n_2}} \, \Phi^{-1}(1 - p_2) \]

  • Variables:
    • \(n_1\) and \(n_2\) are the sample sizes from stage 1 and stage 2, respectively.
    • \(p_1\) and \(p_2\) are the p-values from stage 1 and stage 2, respectively.
    • \(\Phi^{-1}\) is the inverse of the standard normal cumulative distribution function (CDF), which converts p-values into Z-scores.
  • Process:
    • The p-values are first converted into Z-scores.
    • These Z-scores are then weighted by the square root of the proportion of the total sample size contributed by each stage, so that the squared weights sum to one.
    • The weighted Z-scores are summed to produce a combined Z-score, \(Z_2\), which is used to determine the overall significance across stages.

This method assumes that combining information across stages can lead to a more powerful test while still controlling for Type I error, provided the weights are fixed in advance rather than chosen after seeing the interim data.
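
The combination rule can be sketched directly from the formula. The stage-wise p-values and the split of patients between stages below are hypothetical, and one-sided p-values are assumed.

```python
from math import sqrt
from statistics import NormalDist

def inverse_normal_combination(p1, p2, n1, n2):
    """Combined Z-score from stage-wise one-sided p-values with pre-specified n1, n2."""
    w1 = sqrt(n1 / (n1 + n2))
    w2 = sqrt(n2 / (n1 + n2))
    inv = NormalDist().inv_cdf
    return w1 * inv(1 - p1) + w2 * inv(1 - p2)

z2 = inverse_normal_combination(p1=0.08, p2=0.03, n1=150, n2=230)
critical = NormalDist().inv_cdf(1 - 0.025)
print(f"Z2 = {z2:.3f}, reject at one-sided 0.025: {z2 >= critical}")
```

Neither hypothetical stage is significant at 0.025 on its own, yet the combined statistic exceeds the critical value, which illustrates the gain from using both stages.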

2. Closed Testing

Closed testing is a rigorous method for controlling FWER in the context of multiple hypothesis testing, especially when tests are not independent.

  • Elementary Hypotheses:
    • The hypotheses are represented by \(H_1, H_2, H_3,\) etc.
    • The intersections \(H_1 \cap H_2, H_1 \cap H_3,\) etc., represent the joint consideration of multiple hypotheses.
  • Testing Process:
    • A “closed” set of tests is constructed, including all individual hypotheses and their intersections.
    • To reject any elementary hypothesis, say \(H_1\), all higher-order intersections involving \(H_1\) must also be rejected at the predefined alpha level.
    • This method ensures that FWER is controlled under the alpha level across all hypotheses, as rejecting any single hypothesis requires a stringent criterion that all related intersections are also significant.
  • Statistical Rigor:
    • This approach is particularly strict and can be conservative, but it ensures that the overall Type I error rate across multiple tests does not exceed the designated level, α.
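
The closed testing principle can be illustrated for three dose hypotheses. The sketch below assumes each intersection hypothesis is tested with a Bonferroni test, which is only one of several possible choices, and uses hypothetical p-values.

```python
from itertools import combinations

def closed_test(p_values, alpha):
    """Closed testing with Bonferroni tests for each intersection hypothesis."""
    m = len(p_values)
    indices = range(m)
    rejected = []
    for i in indices:
        reject_i = True
        for size in range(1, m + 1):
            for subset in combinations(indices, size):
                if i not in subset:
                    continue
                # Bonferroni test of the intersection of the hypotheses in subset.
                if min(p_values[k] for k in subset) > alpha / size:
                    reject_i = False
        rejected.append(reject_i)
    return rejected

print(closed_test([0.020, 0.004, 0.030], alpha=0.025))
```

With Bonferroni intersection tests the closed procedure reproduces the Bonferroni-Holm decisions, so only the second hypothesis is rejected for these p-values.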

Case 5: Adaptive Multi-arm Multi-stage (MAMS) Design

Overview of Adaptive MAMS Design

Features of MAMS:

  1. Multiple Treatment Arms: Involves comparing several treatment options against a common control group, allowing simultaneous evaluation of multiple interventions.

  2. Multiple Interim Analyses: Scheduled assessments of the accumulating data at multiple points during the trial. These interim looks allow for early decisions about the continuation, modification, or termination of treatment arms.

  3. Early Stopping Rules: The trial can be stopped early for efficacy if a treatment shows clear benefit, or for futility if it’s unlikely to show benefit by the end of the study.

  4. Continuation with Multiple Winners: Unlike traditional designs that might stop after finding one effective treatment, MAMS design can continue to evaluate other promising treatments.

  5. Dropping Losers: Ineffective treatment arms can be discontinued at interim stages, focusing resources on more promising treatments.

  6. Dose Selection: Flexibility to adjust doses or select the most effective dose based on interim results.

  7. Sample Size Re-estimation (SSR): Sample sizes can be recalculated based on interim data to ensure adequate power is maintained throughout the trial, especially useful if initial estimates of effect size (δ) or variability (σ) are inaccurate.

  8. Control of Type-1 Error: Despite the complexity and multiple hypothesis testing involved, the design includes methodologies to maintain strong control over the type-1 error rate, ensuring the validity of the trial’s conclusions.

Example: SOCRATES Reduced Trial

Trial Details:

  • Intervention: Evaluated three doses of vericiguat compared to placebo.
  • Primary Endpoint: Week-12 reduction in the log of NT-proBNP, a biomarker used to assess heart function and heart failure.
  • Sample Size and Power: A total of 388 patients to achieve 80% power for detecting a change of δ = 0.187 in the log NT-proBNP, assuming a standard deviation (σ) of 0.52.

Adaptive Features:

  • Adaptive Design Considerations: The trial was prepared to adjust for different values of δ and σ than initially estimated, which is crucial if the biological effect of vericiguat or the variability in NT-proBNP measurements was misestimated.
  • Interim Analyses with SSR and Drop the Loser: The design included provisions for interim analyses to reassess the continued relevance of each dose. Less promising doses could be dropped (‘Drop the Loser’), and the sample size could be recalculated based on the data gathered to that point (‘SSR’).

Generalization of Two-Arm Group Sequential Design

  1. Wald Statistic Calculation:
    • For each treatment arm compared to placebo, a Wald statistic is calculated at each interim analysis (look j), where i denotes the treatment arm.
    • Formula: \(Z_{ij} = \frac{\hat{\delta}_{ij}}{se(\hat{\delta}_{ij})}\)
      • \(\hat{\delta}_{ij}\) represents the estimated effect size difference between the treatment and placebo at look j.
      • \(se(\hat{\delta}_{ij})\) is the standard error of the estimated effect size.
  2. Multiplicity-Adjusted Boundaries:
    • Adjusted boundaries \(u_{j}\) are set to control the Type-1 error while allowing for multiple looks at the data.
    • The probability under the null hypothesis that the maximum Wald statistic across all looks and treatment arms exceeds the boundary \(u_{j}\) should equal \(\alpha\), ensuring overall Type-1 error control.
  3. Correlation Structure:
    • The correlation between the Wald statistics across different looks and doses is considered and is based on the ratio of the square roots of the respective sample sizes.
    • This correlation needs to be accounted for in the statistical analysis to maintain accurate control of Type-1 error rates.
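
The two ingredients above, the Wald statistic and the correlation across looks, can be sketched as follows. The example assumes a difference in means with known σ and equal per-arm sample sizes; the interim means and sample sizes are hypothetical, and only the within-arm correlation across looks is shown.

```python
from math import sqrt

def wald_statistic(mean_treatment, mean_control, sigma, n_per_arm):
    """Z = delta_hat / se(delta_hat) for a difference in means with known sigma."""
    delta_hat = mean_treatment - mean_control
    se = sigma * sqrt(2 / n_per_arm)
    return delta_hat / se

def look_correlation(n_j, n_k):
    """Correlation of one arm's Wald statistics at two looks with cumulative sizes n_j, n_k."""
    return sqrt(min(n_j, n_k) / max(n_j, n_k))

z = wald_statistic(mean_treatment=0.30, mean_control=0.10, sigma=0.52, n_per_arm=50)
print(f"Z = {z:.2f}, corr(look 1, look 2) = {look_correlation(50, 97):.2f}")
```

Statistics for different doses are also correlated because they share the placebo arm, which is a further reason the boundaries cannot be computed as if the comparisons were independent.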

Strong FWER Control under Adaptations

  1. Adaptations in Trial Design:
    • Treatment Selection Only: No closed testing is required, simplifying the design.
    • Additional Adaptations: If sample size re-estimation, adjustment in the number or spacing of future looks, or changes in the error spending function occur, closed testing and preservation of conditional error rates are required.
  2. Closed Testing:
    • Ensures that adaptations do not inflate the Type-1 error rate. More powerful than seamless Phase II/III designs but requires detailed knowledge of the correlation structure among the test statistics to effectively control the FWER.

Power Gain of MAMS Design over Seamless II/III Design

  1. Comparison of Power Gain:
    • The table compares the power differences between the MAMS design and a standard Seamless II/III design using different methods of multiplicity adjustment (Bonferroni, Simes, and Dunnett).
    • Different scenarios of dose effect assumptions are presented (all doses effective, some doses ineffective), showing that the MAMS design generally offers higher power due to its flexibility in dropping ineffective doses and focusing on promising treatments.

  2. Dose Dropping Criteria:
    • Lists criteria such as any \(\delta_{i1} < 0\), \(\delta_{i1} < -\sigma\), and \(\delta_{i1} < -2\sigma\) to guide dropping ineffective doses, demonstrating how these criteria impact the overall power of the trial.

Case 6: Population Enrichment Design

TAPPAS Trial Overview

Treatment Arms:

  • TRC105 + Pazopanib: TRC105 targets the endoglin receptor and is combined with Pazopanib, which targets the VEGF receptor.
  • Pazopanib Alone: Standard of care, serving as a control.

Subgroups:

  • Two primary subgroups, cutaneous and visceral. The cutaneous subgroup is notably more sensitive to TRC105, suggesting a potential for subgroup-specific efficacy.

Interim Decisions Based on Interim Analysis:

  • Favorable Results: If the interim results are favorable, the trial continues as planned.
  • Promising but Uncertain Results: If results are promising but not conclusively favorable, the trial may adapt by increasing the sample size to enhance statistical power.
  • Unfavorable Results for Combined Therapy: The trial continues as planned or stops for futility based on specific interim findings.
  • Population Enrichment: If the interim results suggest that the cutaneous subgroup is particularly responsive, the trial may shift its focus to this subgroup, enriching the patient population to those most likely to benefit.

Inverse Normal Combination of P-Values

Statistical Methodology for Decision Making:

  • Combining P-values: The method involves using weighted Z-transforms of p-values obtained before and after the interim analysis.
  • Weights: \(w_1\) and \(w_2\) are pre-specified weights assigned to the p-values from the two stages of the trial, typically reflecting the planned share of information in each stage and chosen so that \(w_1^2 + w_2^2 = 1\).

1. In Case of No Enrichment

When there is no enrichment, i.e., the trial continues with the full patient population, significance is declared if both of the following hold:

\[ w_1 \Phi^{-1}(1 - p_1^{FS}) + w_2 \Phi^{-1}(1 - p_2^{FS}) \geq Z_{\alpha} \]

\[ w_1 \Phi^{-1}(1 - p_1^F) + w_2 \Phi^{-1}(1 - p_2^F) \geq Z_{\alpha} \]

Where:

  • \(\Phi^{-1}\) is the inverse of the standard normal cumulative distribution function.
  • \(p_1^{FS}\) and \(p_2^{FS}\) are the stage-1 and stage-2 p-values for the intersection hypothesis, which covers both the full population (F) and the subgroup (S). Read this way, the first inequality is the intersection test that closed testing requires.
  • \(p_1^F\) and \(p_2^F\) are the stage-1 and stage-2 p-values for the full population.
  • \(w_1\) and \(w_2\) are the weights assigned to the p-values from each respective stage.
  • \(Z_{\alpha}\) is the critical value from the standard normal distribution corresponding to the desired overall Type I error rate, \(\alpha\).

2. In Case of Enrichment

When the trial opts for enrichment, i.e., focusing on a specific subgroup (e.g., the cutaneous subgroup) after finding differential treatment effects, significance is declared if both of the following hold:

\[ w_1 \Phi^{-1}(1 - p_1^{FS}) + w_2 \Phi^{-1}(1 - p_2^{FS}) \geq Z_{\alpha} \]

\[ w_1 \Phi^{-1}(1 - p_1^S) + w_2 \Phi^{-1}(1 - p_2^S) \geq Z_{\alpha} \]

Where:

  • \(p_1^S\) and \(p_2^S\) are the p-values from the enriched subgroup (e.g., cutaneous) from stages 1 and 2, respectively.
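
The two decision rules share the same structure, which the sketch below makes explicit. It assumes that both inequalities must hold (the usual reading under closed testing), uses hypothetical weights and p-values, and the function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def combined_z(p1, p2, w1, w2):
    """Weighted inverse normal combination of two stage-wise p-values."""
    inv = NormalDist().inv_cdf
    return w1 * inv(1 - p1) + w2 * inv(1 - p2)

def declare_significance(p_intersection, p_elementary, w1, w2, alpha):
    """Both the intersection test and the elementary test must clear Z_alpha.

    p_intersection: (stage 1, stage 2) p-values written p^FS in the text.
    p_elementary:   (stage 1, stage 2) p-values for the population carried
                    forward: p^F without enrichment, p^S with enrichment.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return (combined_z(*p_intersection, w1, w2) >= z_alpha
            and combined_z(*p_elementary, w1, w2) >= z_alpha)

w1, w2 = sqrt(0.4), sqrt(0.6)  # hypothetical pre-specified weights
print(declare_significance((0.10, 0.02), (0.06, 0.02), w1, w2, alpha=0.025))
```

Switching between the enriched and non-enriched case only changes which elementary p-values are passed in; the weights and the critical value stay fixed.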

Strategic Recruitment Approach - Rare Disease

Recruitment is very challenging because the disease is rare, so it is easier to start small and ask for more patients later if the interim data justify it:

  • Start with an initial sample size of 125: This allows the trial to begin with a manageable cohort and adjust based on early insights.
  • Adaptations Based on Zones:
    • If in promising zone: Increase to 200 to enhance statistical power and confirm preliminary findings.
    • If in enrichment zone: Increase to 180, concentrating more on the subgroup showing better responsiveness.
    • If in favorable zone: Maintain the current sample size as the efficacy is already well demonstrated, potentially accelerating the trial conclusion.

Zone: Categorizes the possible outcomes of interim analysis into four zones:

  • Enrich: Indicates scenarios where focusing on a specific subgroup (e.g., cutaneous) might be beneficial.
  • Unfav (Unfavorable): Situations where results are not promising, potentially leading to stopping the trial for futility.
  • Prom (Promising): Interim results suggest potential efficacy that could be confirmed with a larger sample size.
  • Favor (Favorable): Strong evidence of efficacy as planned, potentially moving towards a quicker conclusion or regulatory submission.

Source

Cytel Webinars: Introduction to Design and Monitoring of Adaptive Clinical Trials